Night-time Safety Index

Scenario

Melbourne is a vibrant city with a 24-hour economy. As a result, transport authorities, city planners and concerned community members are increasingly worried about the city's night-time safety. Melbourne has a thriving culture, with many events, hospitality venues, shops and public transport services running late into the night. However, not all parts of Melbourne are safe: an increasing number of crime incidents occur in the city, particularly in isolated walkways and poorly lit streets. This raises concern in the community for young people, the elderly, women, students and shift workers alike. To address this problem, I have used data from the City of Melbourne open datasets. The datasets used include:

  • Street lighting
  • Pedestrian foot traffic
  • Public transport stop locations

By developing the Night-time Safety Index, I aim to identify areas within Melbourne that are high-risk zones: areas with low visibility, little or no street lighting, little foot traffic and no nearby public transport stop. Conversely, areas are marked as safer if they have higher visibility, more foot traffic (which acts as a perceived layer of safety) and access to public transport, since more people are likely to be present there. The index is developed from an understanding of crime data for poorly lit and sparsely populated areas.

With the Night-time Safety Index, we can work to improve these areas, raise their safety scores and build a safer, more inclusive city for all members of the community.

As a student who regularly commutes to the city for work and university, I often end up travelling after dark. I want to know which areas in Melbourne are safe or dangerous at night so that I can plan safer routes, avoid dangerous areas and feel confident that I will be safe moving through the city at night.

What this use case will teach you

At the end of this use case you will be able to:

  • Retrieve and process data from a public API
  • Apply data cleaning and preprocessing techniques to geospatial and time data
  • Perform basic aggregation and filtering
  • Perform analysis using latitude and longitude data
  • Implement data visualisation techniques

Introduction

This use case aims to develop a Night-time Safety Index by combining datasets from the City of Melbourne Open Data project. Once combined, the data can be assessed and visualised to show how safety levels vary across different areas during night hours. The goal of the project is to discover areas with low visibility, low foot traffic or limited infrastructure, to aid future development and create a safer Melbourne for all. The analysis draws on the datasets below, accessed via the Melbourne Open Data API, to give a data-driven basis for a safer and more inclusive city.

Dataset Links

  • Bus Stops data link: https://data.melbourne.vic.gov.au/explore/dataset/bus-stops/api/
  • Street Lighting data link: https://data.melbourne.vic.gov.au/explore/dataset/street-lights-with-emitted-lux-level-council-owned-lights-only/api/
  • Pedestrian counting link: https://data.melbourne.vic.gov.au/explore/dataset/pedestrian-counting-system-monthly-counts-per-hour/api/
  • Featured light data link: https://data.melbourne.vic.gov.au/explore/dataset/feature-lighting-including-light-type-wattage-and-location/api/
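
The dataset IDs in the links above can be turned into download requests. The sketch below is one possible way to do this using only the standard library. It assumes the portal exposes the Explore API v2.1 export endpoint at the path shown in `BASE_URL`; the helper names `build_export_url` and `fetch_records` are hypothetical and are not taken from the notebook.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base path of the Explore API v2.1 on the City of Melbourne portal.
BASE_URL = "https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets"


def build_export_url(dataset_id, fmt="json", limit=-1):
    """Build the export URL for one dataset (limit=-1 requests every record)."""
    query = urlencode({"limit": limit, "timezone": "UTC"})
    return f"{BASE_URL}/{dataset_id}/exports/{fmt}?{query}"


def fetch_records(dataset_id, timeout=60):
    """Download a dataset and return its records as a list of dicts."""
    with urlopen(build_export_url(dataset_id), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_records("bus-stops")` would return a list of dictionaries, which can be passed to `pandas.DataFrame(...)` to obtain a dataframe for the later steps.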

Geospatial Visualisation of Street Lights

Description:

The scatter plot below shows the geographical distribution of street lights across the mapped area. Each point represents a single street light. The plot gives spatial context for the spread of the data and helps reveal light coverage and potentially underlit areas.


[Figure: scatter plot of street light locations]

Overview of Geospatial Visualisation:

The scatter plot provides key insights into the dataset, in particular the spread of the street light data. As shown, key areas that should be present are missing when the points are overlaid on a map of the City of Melbourne. This calls into question the coverage of the data: which datasets are missing, and what steps we can take to account for the significant gaps in the street light data.

Importing Feature Light Data

The data is fetched using the API, and the coordinate columns are renamed to latitude and longitude to keep a consistent naming convention across all datasets. This enables seamless geospatial analysis and integration with other location-based data.


     | asset_number | asset_description                                 | lamp_type_lupvalue | lamp_rating_w | mounting_type_lupvalue | latitude   | longitude  | location
0    | 1544260      | Feature Lighting - Birrarung Marr                 | 13.0               | 70.0          | Pole: Multiple Fixed   | -37.818239 | 144.971382 | {'lon': 144.9713815748613, 'lat': -37.81823859...
1    | 1541782      | Feature Lighting -                                | 13.0               | 35.0          | Pole: Multiple Fixed   | -37.822848 | 144.947094 | {'lon': 144.94709354140863, 'lat': -37.8228478...
2    | 1542772      | Feature Lighting -                                | 12.0               | NaN           | Pole: Multiple Fixed   | -37.823150 | 144.947204 | {'lon': 144.9472041813461, 'lat': -37.82314998...
3    | 1346470      | Feature Lighting - Docklands                      | 1.0                | NaN           | Canopy                 | -37.817318 | 144.952251 | {'lon': 144.95225109118593, 'lat': -37.8173181...
4    | 1539337      | Feature Lighting - Newquay Promenade between S... | 9.0                | NaN           | Pole: Multiple Fixed   | -37.814603 | 144.942694 | {'lon': 144.94269431917522, 'lat': -37.8146026...
...  | ...          | ...                                               | ...                | ...           | ...                    | ...        | ...        | ...
8559 | 1347738      | Feature Lighting - Docklands                      | 1.0                | 18.0          | Wall                   | -37.824620 | 144.946620 | {'lon': 144.94662008927858, 'lat': -37.8246201...
8560 | 1541845      | Feature Lighting -                                | NaN                | NaN           | Pole: Multiple Fixed   | -37.823748 | 144.952091 | {'lon': 144.9520910323129, 'lat': -37.82374757...
8561 | 1346811      | Feature Lighting - Docklands                      | 3.0                | 36.0          | Parapet                | -37.817528 | 144.950016 | {'lon': 144.95001579629215, 'lat': -37.8175284...
8562 | 1544683      | Feature Lighting - Seafarers Rest                 | 2.0                | 14.0          | Pole: Multiple Fixed   | -37.822771 | 144.951655 | {'lon': 144.95165526864113, 'lat': -37.8227706...
8563 | 1542075      | Feature Lighting - Arglye Square                  | 9.0                | NaN           | Pole: Multiple Fixed   | -37.802565 | 144.966134 | {'lon': 144.9661338329207, 'lat': -37.80256495...

8564 rows × 8 columns

Visualisation of the Feature Light Lamp Wattage Distribution

Description:

The bar plot below visualises the distribution of lamp wattage in the feature light data. This will aid in deciding how to weight the safety score based on light wattage.


[Figure: bar plot of the lamp wattage distribution]

Wattage Distribution Insight

Observation:
The majority of lights in the feature_light_df dataset fall within the 0–100 watt range, with only a few exceeding 400 watts. The distribution is therefore right-skewed, with a long tail towards high wattages. This suggests an intentional design strategy in urban lighting infrastructure, assuming wattage is chosen according to the density of the location.

Interpretation:
The relatively low wattage per individual light may reflect a density-based approach with a higher concentration of lights, as each light requires less power to produce sufficient illumination. This enables:

  • Better energy efficiency
  • Reduced light pollution
  • Even light distribution across high-density areas

Implication for Analysis:
When evaluating visibility within an area, it is essential to take into consideration both individual lamp wattage and the spatial density of lighting.

Initialising safety_score for feature_light_df and Cleaning

Description:

  • The safety score is attached to feature_light_df and assigned based on lamp_rating_w: the wattage of each light is checked against a set of ranges and a safety score is assigned accordingly.

  • A light below 50 W is given a rating of 1, a light below 100 W a rating of 2, a light below 300 W a rating of 3, and anything above that a rating of 4. If the value is missing, the light is given a rating of 1, as it is most likely to be 50 W or less.

  • Finally, .drop() removes columns that are not used, so that the dataset is cleaner and clearer.
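
The wattage thresholds described above can be expressed as a small function. This is a minimal sketch rather than the notebook's own code: the function name `wattage_safety_score` is hypothetical, and treating the exact boundary values (50, 100, 300) as belonging to the higher band is an assumption, since the description only says "below".

```python
import math


def wattage_safety_score(lamp_rating_w):
    """Map a lamp wattage to a 1-4 safety score; missing values score 1."""
    if lamp_rating_w is None or math.isnan(lamp_rating_w):
        return 1  # missing wattage is treated as a low-wattage light
    if lamp_rating_w < 50:
        return 1
    if lamp_rating_w < 100:
        return 2
    if lamp_rating_w < 300:
        return 3
    return 4  # 300 W and above (boundary handling is an assumption)
```

On a dataframe this could be applied with `feature_light_df["lamp_rating_w"].apply(wattage_safety_score)`, storing the result in the safety_score column before unused columns are dropped.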

Geospatial Visualisation of Pedestrian Counting Sensors

Description:

The plot below, drawn with matplotlib, shows the locations of all sensors and demonstrates their spread throughout Melbourne.


[Figure: scatter plot of pedestrian counting sensor locations]

As the results show, there is a good spread of pedestrian count sensors throughout the city. This is good news, as it means we can use the sensor data reliably.

Visualisation of Pedestrian Counts by Location

Description:

The bar plot below, drawn with matplotlib, shows the pedestrian counts for every sensor location. It gives a visual representation of the most frequented locations within Melbourne.


[Figure: bar plot of pedestrian counts by sensor location ID]

From the graph:

  • We can see the location IDs with the greatest foot traffic
  • We can also see the locations with the least foot traffic
  • Stand-out location IDs with the largest foot traffic in either direction are 35, 24, 41, 47, 59, 66 and 84
  • Low foot traffic locations include IDs 10, 44, 46, 51, 71, 75, 76, 78 and 118, among others
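
A ranking like the one read off the graph can also be computed directly. The sketch below sums counts per sensor and lists the busiest and quietest locations. The field names `location_id`, `direction_1` and `direction_2` are assumptions about the pedestrian counting dataset, and both function names are hypothetical.

```python
from collections import defaultdict


def total_counts_by_location(records):
    """Sum the counts in both directions for every location_id."""
    totals = defaultdict(int)
    for row in records:
        totals[row["location_id"]] += row["direction_1"] + row["direction_2"]
    return dict(totals)


def rank_locations(totals, n=3):
    """Return (n busiest IDs, n quietest IDs), each ordered most extreme first."""
    ordered = sorted(totals, key=totals.get, reverse=True)
    return ordered[:n], ordered[-n:][::-1]
```

With a dataframe, the same aggregation is a `groupby("location_id")` followed by a sum and a sort, so the plain-Python version here is mainly to make the logic explicit.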

Importing Landmark Data

Several key tasks are completed to prepare the landmark dataset for analysis. First, the dataset is retrieved using the Melbourne Open Data v2.1 API, ensuring the data is up to date. Next comes coordinate extraction: the lat and lon values are stored in dictionary format in co_ordinates, so new latitude and longitude columns are created by extracting both values from it. The data is then validated to ensure it is clean and ready for mapping, with each row checked to confirm that co_ordinates is a valid dictionary. Finally, a preview of the data is produced with .head(), the unique landmarks are printed to show the different landmark types included in the dataset, and .describe() is used to summarise the dataset as part of the EDA.

The EDA provides useful information, such as the number of unique landmarks, which will prove useful when assigning an appropriate safety score to each landmark type. It also gives a good sense of the scope of the data: there are 242 data points. These data points will help compensate for the lack of street light data within the city.
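
The coordinate extraction and validation steps described in this section might look like the following sketch. It works on plain dictionaries rather than a dataframe. The function names and the `feature_name` key used in the usage note are hypothetical, and the latitude/longitude range check is an extra safeguard that the notebook does not mention.

```python
def extract_lat_lon(co_ordinates):
    """Return (latitude, longitude) from a {'lat': ..., 'lon': ...} dict, else None."""
    if not isinstance(co_ordinates, dict):
        return None
    lat, lon = co_ordinates.get("lat"), co_ordinates.get("lon")
    if not isinstance(lat, (int, float)) or not isinstance(lon, (int, float)):
        return None
    # Assumed extra check: reject values outside valid coordinate ranges.
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return float(lat), float(lon)


def add_coordinates(rows):
    """Keep rows with valid co_ordinates and add latitude/longitude keys."""
    cleaned = []
    for row in rows:
        pair = extract_lat_lon(row.get("co_ordinates"))
        if pair is not None:
            cleaned.append({**row, "latitude": pair[0], "longitude": pair[1]})
    return cleaned
```

For example, a row such as `{"feature_name": "A", "co_ordinates": {"lon": 144.9661, "lat": -37.8026}}` is kept and gains latitude and longitude keys, while a row whose co_ordinates value is missing or not a dictionary is dropped.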

Visualising Landmarks

Viewing the landmarks on the map gives a good representation of the provided landmarks and where each one is located. The map uses a unique icon for each landmark, and clicking an icon shows the description of the landmark.

 

What We Can See

  • The spread of the key landmarks within Melbourne
  • Which landmarks exist within the city
  • A better understanding of the steps to take in assigning landmark score values

Landmark Score

Below, the landmark safety score is assigned to each of the key landmarks in the dataset. The score is determined by the type of land use and its impact on pedestrian activity, lighting and surveillance. Locations such as health services, education centres and community hubs receive a higher score because of their frequent foot traffic and established infrastructure. On the other hand, locations such as vacant land, warehouses and industrial zones are assigned lower scores because of lower pedestrian activity, reduced visibility and underutilisation; these areas are often unvisited and empty at night. This scoring lets us quantify how various landmarks contribute to or detract from night-time safety in Melbourne.
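
A scoring scheme of this kind is naturally written as a lookup table. The sketch below is illustrative only: the notebook's actual values are not reproduced here, so apart from the relative ordering described above, every number, the default score, and the names `LANDMARK_SCORES` and `landmark_safety_score` are assumptions.

```python
# Illustrative values only. The text says health, education and community
# locations score higher and vacant land, warehouses and industrial zones
# score lower; the specific numbers below are assumed for this example.
LANDMARK_SCORES = {
    "Health Services": 3,
    "Education Centre": 3,
    "Community Use": 3,
    "Warehouse/Store": 0,
    "Industrial": 0,
    "Vacant Land": -1,
}
DEFAULT_SCORE = 1  # assumed fallback for landmark types not listed above

# Lower-cased lookup so that matching ignores capitalisation.
_NORMALISED_SCORES = {name.lower(): score for name, score in LANDMARK_SCORES.items()}


def landmark_safety_score(landmark_type):
    """Look up the score for a landmark type, ignoring case and outer spaces."""
    key = str(landmark_type).strip().lower()
    return _NORMALISED_SCORES.get(key, DEFAULT_SCORE)
```

Keeping the scores in a single dictionary makes the weighting easy to review and adjust without touching the rest of the pipeline.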

Visualisation of Safety Score Distribution

Description: The code cell implements a bar graph of the safety score for each landmark type, giving a clear visual representation of how the scores are spread across landmarks.

 
[Figure: bar graph of safety score by landmark type]

The lowest-scoring landmark types are Warehouse/Store and Industrial, each with a value of 0. The only negative score is assigned to vacant land.

Geospatial Visualisation of Bus Stops

Description: The code cell below plots the locations of all bus stops in bus_stops_df as a scatter plot. This will serve as a visual representation of the spread of the data and aid in decisions based on the bus stop dataset.


[Figure: scatter plot of bus stop locations]

As the scatter plot shows, the bus stop data is well distributed.

Visualising the Datasets Using Folium

I have iterated through each of the datasets and coloured the elements respectively: blue for bus stops, red for pedestrian count sensor locations and yellow for street lights. This provides a visual guide to the location of each street light, bus stop and pedestrian sensor. With this information we can better understand how to progress the project: which data is scarce, and which areas need more data sources to produce reliable information.
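
The colour-coded iteration described above can be sketched as follows. The layer names, the function names and the map centre and zoom are assumptions chosen for illustration. Folium is imported inside `build_map` so that the colour-assignment helper can be used and checked without Folium installed.

```python
LAYER_COLOURS = {
    "bus_stops": "blue",
    "pedestrian_sensors": "red",
    "street_lights": "yellow",
}


def marker_specs(layers):
    """Flatten {layer_name: [(lat, lon), ...]} into (lat, lon, colour) tuples."""
    return [
        (lat, lon, LAYER_COLOURS[name])
        for name, points in layers.items()
        for lat, lon in points
    ]


def build_map(layers, centre=(-37.8136, 144.9631), zoom_start=14):
    """Draw every point as a small coloured circle on a Folium map."""
    import folium  # imported here so marker_specs works without Folium

    city_map = folium.Map(location=list(centre), zoom_start=zoom_start)
    for lat, lon, colour in marker_specs(layers):
        folium.CircleMarker(
            location=[lat, lon], radius=3, color=colour, fill=True, fill_opacity=0.7
        ).add_to(city_map)
    return city_map
```

Small circle markers are used here instead of pin icons because thousands of overlapping pins quickly become unreadable at city scale.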



Visualisation Summary of Spatial Data

The output generated is an interactive map of the Melbourne CBD featuring:

  • Yellow dots: Locations of council-owned street lights
  • Blue dots: Bus stop locations
  • Red dots: Pedestrian sensor locations

From the map, several key insights can be drawn:

  • Street light coverage is uneven across the city. Lighting is heavily concentrated in certain zones while some areas are noticeably lacking.
  • Pedestrian sensors are installed only at specific points, which means that pedestrian traffic data is not uniformly available city-wide.
  • This limitation highlights the need for extrapolation and data augmentation in areas lacking direct measurements.
  • Visualising this spatial data has proven highly informative: it provides critical context to guide the design of a more accurate Night-time Safety Index.

The visualisation acts as a planning tool for better understanding urban safety infrastructure coverage.

Grouping by Latitude and Longitude Grid Bins

This section of the notebook prepares all of the datasets for merging and incorporates the street light density into the Night-time Safety Index calculation.

Description:

  1. Safety Score Renaming
  • Each dataset's safety_score column is renamed to a unique identifier so that the datasets can be merged into a single dataset.
  2. Location Binning
  • All datasets are given latitude and longitude bins by rounding coordinates to 4 decimal places, which corresponds to roughly 11 metres (0.0001 degrees of latitude). The spatial bins make it possible to merge features according to their proximity, and they also underpin the light density calculation.
  3. Light Density
  • Light density accounts for the number of lights within a location bin: if individual lights have a low wattage but there are many of them, the density compensates for the low wattage.
  • To implement this, I first combined both light datasets into one, giving a single set of lat_bin and lon_bin values. I then grouped the data by lat_bin and lon_bin and took the group size, since in some cases several lights fall within the same location bin. The size is stored in a new column, light density.
  • The light density was then normalised so that it does not skew the results heavily in favour of locations with very high light density. The light density safety score ranges from a minimum of 0 to a maximum of 4, keeping it in range with the other safety scores.
  4. Safety Feature Consolidation
  • A list of simplified dataframes is created, each containing lat_bin, lon_bin and its respective safety-related score. These are later merged to compute the final safety score per grid cell.
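
The binning, light density and consolidation steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the notebook's code: min-max scaling is assumed as the normalisation method (the text only says the result lies between 0 and 4), and the function names and the `bus_stop_score` key in the usage note are hypothetical.

```python
from collections import Counter, defaultdict

PRECISION = 4  # decimal places; one step is about 11 m of latitude


def to_bin(lat, lon, precision=PRECISION):
    """Round a coordinate pair to its (lat_bin, lon_bin) grid key."""
    return (round(lat, precision), round(lon, precision))


def light_density_scores(light_points, max_score=4.0):
    """Count lights per bin, then min-max scale the counts to 0..max_score."""
    counts = Counter(to_bin(lat, lon) for lat, lon in light_points)
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    return {
        cell: (max_score * (n - lowest) / span if span else 0.0)
        for cell, n in counts.items()
    }


def merge_features(named_scores):
    """Outer-join several {cell: score} dicts into {cell: {name: score}}."""
    merged = defaultdict(dict)
    for name, scores in named_scores.items():
        for cell, score in scores.items():
            merged[cell][name] = score
    return dict(merged)
```

Calling `merge_features({"light_density_score": density, "bus_stop_score": stops})` gives one record per grid cell, with each source keeping its own uniquely named score. Note that with min-max scaling the least dense occupied bin scores 0, which may or may not match the notebook's choice.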

EDA on final_safety_score

Exploratory data analysis is carried out on the final safety score to check that the spread of values is reasonable and that there are no extreme outliers. This confirms that the final safety score behaves as intended.

 
count    20573.000000
mean         0.482831
std          0.191851
min         -0.142857
25%          0.285714
50%          0.428571
75%          0.571429
max          1.285714
Name: final_safety_score, dtype: float64

Looking at the results, there are 20,573 location grid cells with a mean final_safety_score of 0.48 and a standard deviation of 0.19. The middle half of the cells score between 0.29 and 0.57, so most values sit close to the mean. The maximum final safety score is 1.29 and the minimum is -0.14.
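
The same kind of summary can be computed without pandas, which also makes an outlier check explicit. The sketch below uses the standard `statistics` module. The 1.5 × IQR rule in `iqr_outliers` is one common convention chosen for illustration, not something the notebook states, and both function names are hypothetical.

```python
import statistics


def describe_scores(scores):
    """Return count, mean, std, min, quartiles and max for a list of scores."""
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "count": len(scores),
        "mean": statistics.fmean(scores),
        "std": statistics.stdev(scores),  # sample standard deviation
        "min": min(scores),
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": max(scores),
    }


def iqr_outliers(scores, k=1.5):
    """List the scores lying more than k * IQR outside the quartiles."""
    summary = describe_scores(scores)
    iqr = summary["75%"] - summary["25%"]
    low, high = summary["25%"] - k * iqr, summary["75%"] + k * iqr
    return [s for s in scores if s < low or s > high]
```

Running `iqr_outliers` on the final_safety_score values would show how many grid cells fall outside the expected range, which is a more concrete check than reading the summary table by eye.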

Visualisation of the Distribution of Final Safety Score

The cell below provides a histogram of the distribution of values to check that the spread is even.

 
[Figure: histogram of final_safety_score values]

The values are well distributed and consistent with the summary statistics above.

Heatmap Visualisation of Final Safety Score

This section generates an interactive heatmap to visualise the calculated final_safety_score across the City of Melbourne.

Description:

  • A Folium map is created, centered on Melbourne's CBD.
  • The heat_data list extracts the lat_bin, lon_bin, and corresponding final_safety_score from the combined_df dataframe.
  • A heat layer is added using folium.plugins.HeatMap(), where:
    • Each point on the map corresponds to a geographic bin (rounded lat/lon).
    • The intensity of the heat is based on the final_safety_score, highlighting areas of varying night-time safety.
    • Higher scores produce a "hotter" (brighter) color, typically representing safer areas.

This heatmap provides an intuitive geospatial understanding of the Night-time Safety Index.
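
The steps listed above could be sketched as follows. The column names lat_bin, lon_bin and final_safety_score come from the description. The function names, the map centre, the zoom level and the heat radius are assumptions. Folium is imported inside `build_heatmap` so that the data-preparation helper has no dependencies.

```python
def build_heat_data(rows):
    """Turn grid-cell rows into [lat, lon, weight] triples for a heat layer."""
    return [
        [row["lat_bin"], row["lon_bin"], row["final_safety_score"]]
        for row in rows
        if row["final_safety_score"] is not None
    ]


def build_heatmap(rows, centre=(-37.8136, 144.9631), zoom_start=13):
    """Create a Folium map with a heat layer weighted by final_safety_score."""
    import folium  # imported lazily so build_heat_data has no dependencies
    from folium.plugins import HeatMap

    heat_map = folium.Map(location=list(centre), zoom_start=zoom_start)
    HeatMap(build_heat_data(rows), radius=10).add_to(heat_map)
    return heat_map
```

With a dataframe, the rows can be obtained with `combined_df.to_dict("records")` before being passed to `build_heatmap`.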



Explanation

The heatmap gives a clear outline of the areas considered safe: these are marked in red, while darker blue areas are less safe. As the map shows, the grid does not cover the whole of Melbourne, and regions that are not covered have no colouring, but it gives a good overview of the locations it does cover. The main regions the heatmap marks as safe are near train stations, which suggests the index is working as expected, and the red areas clearly coincide with busy, higher-traffic areas. The Docklands area scores particularly well because featured light data is available there and the density of lights in that region has earned it a higher value.

Conclusion: Night-time Safety Index

The Night-time Safety Index provides a data-driven approach to evaluating perceived and infrastructural safety across Melbourne during night-time hours. By compiling multiple open datasets, including pedestrian traffic, featured lighting, council-owned street lighting, bus stops, City Circle tram stops and public landmarks, I constructed a geospatial index that captures both activity levels and environmental safety factors.

Through feature engineering and location grid binning, I was able to normalise the datasets and generate a combined safety score for each location grid cell. The resulting heatmap clearly highlights areas of concern, as well as areas where good industrial development has taken place and which are safer.

The index is not limited to individuals who are travelling at night and want to avoid dangerous areas or get home safely. It also gives city planners and local authorities insight into community safety and helps them identify gaps in safety. There is still much to explore with the Night-time Safety Index. When developing the index, a large part of the city lighting data was missing. As a result, I contacted the City of Melbourne to request access to the light data, but was unfortunately unable to receive it in time. A complete dataset would make the Night-time Safety Index more accurate.

This index serves as a foundation for future enhancements such as incorporating real-time incident reports or machine learning models to predict safety fluctuations. Finally, the Night-time Safety Index is a scalable framework that promotes smarter, safer urban environments through transparent, open data analytics.